-
Notifications
You must be signed in to change notification settings - Fork 12.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Tokenize emoji as if they were valid identifiers #88781
Conversation
r? @oli-obk (rust-highfive has picked a reviewer for you, use r? to override) |
This comment has been minimized.
This comment has been minimized.
cc @Manishearth |
LL | fn i_like_to_😅_a_lot() -> 👀 { | ||
| ^^^^^^^^^^^^^^^^^^ | ||
|
||
error: identifiers cannot contain emoji: `ABig👩👩👧👧Family` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🤣 the family got deaggregated
@bors r+ rollup |
📌 Commit 5979ed5 has been approved by |
Tokenize emoji as if they were valid identifiers In the lexer, consider emojis to be valid identifiers and reject them later to avoid knock down parse errors. Partially address rust-lang#86102.
/// Places where identifiers that contain invalid Unicode codepoints but that look like they | ||
/// should be. Useful to avoid bad tokenization when encountering emoji. We group them to | ||
/// provide a single error per unique incorrect identifier. | ||
pub bad_unicode_identifiers: Lock<FxHashMap<Symbol, Vec<Span>>>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pub bad_unicode_identifiers: Lock<FxHashMap<Symbol, Vec<Span>>>, | |
pub bad_unicode_identifiers: Lock<VecMap<Symbol, Vec<Span>>>, |
Hash maps are great, until they aren't XD
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tried using it's API, but it really doesn't fit well with this case. I'm sorting the output before emitting the diagnostic instead.
5979ed5
to
30f9807
Compare
@bors r=oli-obk |
📌 Commit 30f9807 has been approved by |
Tokenize emoji as if they were valid identifiers In the lexer, consider emojis to be valid identifiers and reject them later to avoid knock down parse errors. Partially address rust-lang#86102.
Tokenize emoji as if they were valid identifiers In the lexer, consider emojis to be valid identifiers and reject them later to avoid knock down parse errors. Partially address rust-lang#86102.
This change is needed to land rust-lang/rust#88781, as it changes the handling of emojis in source code to treat them as identifiers.
Change test to not trigger emoji error This change is needed to land rust-lang/rust#88781, as it changes the handling of emojis in source code to treat them as identifiers.
Once rust-lang/cargo#10117 is pulled into this repo, we can merge this PR. |
@estebank Feel free to update cargo in this PR ( |
@bors r=oli-obk |
📌 Commit d929164 has been approved by |
⌛ Testing commit d929164 with merge 9dc1d2de4eb6199a77eedd3b61e644919794b59c... |
The job Click to see the possible cause of the failure (guessed by this bot)
|
💔 Test failed - checks-actions |
☀️ Test successful - checks-actions |
Finished benchmarking commit (23a4366): comparison url. Summary: This benchmark run did not return any relevant changes. If you disagree with this performance assessment, please file an issue in rust-lang/rustc-perf. @rustbot label: -perf-regression |
In the lexer, consider emojis to be valid identifiers and reject
them later to avoid knock down parse errors.
Partially address #86102.